Metaheuristics are widely used in various fields and have attracted much attention in the scientific and industrial communities. In recent years, the number of new metaheuristic names has been growing continuously. Generally, the inventors attribute the novelty of these new algorithms to inspirations from biology, human behavior, physics, or other phenomena. In addition, these new algorithms, when compared against basic versions of other metaheuristics on classical benchmark problems without shift/rotation, show competitive performance. In this study, we exhaustively tabulate more than 500 metaheuristics. To comparatively evaluate the performance of recent competitive variants and newly proposed metaheuristics, 11 newly proposed metaheuristics and 4 variants of established metaheuristics are comprehensively compared on the CEC2017 benchmark suite. In addition, we investigate whether these algorithms have a search bias toward the center of the search space. The results show that the newly proposed EBCM (effective butterfly optimizer with covariance matrix adaptation) algorithm performs comparably to the 4 well-performing variants of established metaheuristics and possesses similar properties and behaviors, such as convergence, diversity, and exploration-exploitation trade-offs, in many aspects. The performance of all 15 algorithms is likely to deteriorate under certain transformations, while the 4 state-of-the-art metaheuristics are less affected by transformations such as shifting the global optimum away from the center of the search space. It should be noted that, except for EBCM, the other 10 new algorithms, proposed mostly during 2019-2020, are inferior to the well-performing 2017 variants of differential evolution and evolution strategy in terms of convergence speed and global search ability on the CEC2017 functions.
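As a rough illustration of the kind of transformation discussed above, the sketch below (in Python, using a simple sphere function as a stand-in for a benchmark problem) shows how shifting a test function moves its global optimum away from the center of the search space; an optimizer with a center bias will look better on the unshifted version than on the shifted one.

```python
import numpy as np

def sphere(x):
    """Classical sphere benchmark: global optimum at the origin (the center of [-100, 100]^d)."""
    return float(np.sum(np.asarray(x) ** 2))

def make_shifted(f, shift):
    """Return a shifted version of f whose global optimum sits at `shift` instead of the center."""
    def shifted(x):
        return f(np.asarray(x) - shift)
    return shifted

rng = np.random.default_rng(0)
dim = 10
# Move the optimum to a random point well away from the center of [-100, 100]^dim.
shift = rng.uniform(-80, 80, size=dim)
f_shifted = make_shifted(sphere, shift)

# A center-biased algorithm reports much better values on `sphere` (optimum at the
# center) than on `f_shifted` (optimum displaced), even though both are the same landscape.
print(sphere(np.zeros(dim)), f_shifted(np.zeros(dim)))
```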
This paper studies deep reinforcement learning (DRL) for solving the task scheduling problem of multiple unmanned aerial vehicles (UAVs). Current approaches generally use exact and heuristic algorithms to solve this problem, but the computation time increases rapidly as the task scale grows, and heuristic rules need to be designed manually. As a self-learning method, DRL can obtain high-quality solutions quickly without hand-crafted rules. However, the huge decision space makes training a DRL model unstable for large-scale tasks. In this work, to address large-scale problems, we develop a divide-and-conquer framework (DCF) that decomposes the original problem into task allocation and UAV route planning subproblems, which are solved in the upper and lower levels, respectively. Based on the DCF, a double-level deep reinforcement learning approach (DL-DRL) is proposed, in which an upper-level DRL model is designed to allocate tasks to appropriate UAVs and a lower-level DRL model [i.e., the widely used attention model (AM)] is applied to generate feasible UAV routes. Since the upper-level model determines the input data distribution of the lower-level model, and its reward is computed via the lower-level model during training, we develop an interactive training strategy (ITS), in which the whole training process consists of pre-training, intensive training, and alternate training stages. Experimental results show that our DL-DRL outperforms mainstream learning-based methods and most traditional baselines, and is competitive with the state-of-the-art heuristic method (i.e., OR-Tools), especially on large-scale problems. The strong generalizability of DL-DRL is also verified by testing the learned model on problems of much larger scale. Moreover, an ablation study shows that our ITS achieves a compromise between model performance and training duration.
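The divide-and-conquer control flow described above can be pictured roughly as follows; this is a schematic Python sketch in which `upper_level_allocate` and `lower_level_route` are placeholders for the upper- and lower-level DRL policies, not the authors' actual models.

```python
import numpy as np

def upper_level_allocate(tasks, num_uavs, rng):
    """Placeholder for the upper-level DRL model: assign each task to a UAV.
    In the paper this is a learned policy; here it is a random assignment."""
    return rng.integers(0, num_uavs, size=len(tasks))

def lower_level_route(task_coords):
    """Placeholder for the lower-level attention model: order the assigned tasks
    into a route and return its total length (the negative of the reward)."""
    if len(task_coords) == 0:
        return 0.0
    order = np.argsort(task_coords[:, 0])  # trivial stand-in for a learned routing policy
    route = task_coords[order]
    return float(np.sum(np.linalg.norm(np.diff(route, axis=0), axis=1)))

rng = np.random.default_rng(0)
tasks = rng.uniform(0, 100, size=(50, 2))   # 50 task locations
num_uavs = 5

assignment = upper_level_allocate(tasks, num_uavs, rng)
total_cost = sum(lower_level_route(tasks[assignment == u]) for u in range(num_uavs))
reward_for_upper_level = -total_cost  # the lower level's outcome provides the upper level's reward
print(reward_for_upper_level)
```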
Remembering and forgetting are two sides of the same coin in the human learning-memory system. Inspired by the memory mechanisms of the human brain, modern machine learning systems have been striving to endow machines with lifelong learning capability through better remembering, while treating forgetting as an enemy to overcome. Nevertheless, this idea may capture only half of the picture. Recently, more and more researchers have argued that the brain is born to forget, i.e., forgetting is a natural and active process toward abstract, rich, and flexible representations. This paper proposes a learning model with an active forgetting mechanism for artificial neural networks. The active forgetting mechanism (AFM) is introduced into a neural network via a "plug-and-play" forgetting layer (P&PF), which consists of inhibitory neurons with an internal regulation strategy (IRS) that adjusts their own extinction rate through lateral inhibition, and an external regulation strategy (ERS) that regulates the extinction rate of excitatory neurons through inhibition. Experimental studies show that the P&PF offers surprising benefits: adaptive structure, strong generalization, long-term learning and memory, and robustness to data and parameter perturbations. This work sheds light on the importance of forgetting in the learning process and provides new perspectives for understanding the underlying mechanisms of neural networks.
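One hedged way to picture a "plug-and-play" forgetting layer is as a gating module whose inhibitory units decide how strongly each feature is suppressed; the PyTorch sketch below is an illustrative interpretation of that idea, not the authors' exact P&PF design.

```python
import torch
import torch.nn as nn

class ForgettingGate(nn.Module):
    """A rough sketch of a plug-and-play forgetting layer: inhibitory units produce
    per-feature decay gates that gradually 'extinguish' excitatory activations.
    Illustrative interpretation only; the IRS/ERS stand-ins below are hypothetical."""

    def __init__(self, num_features):
        super().__init__()
        self.inhibitory = nn.Linear(num_features, num_features)    # stand-in for external regulation (ERS)
        self.self_decay = nn.Parameter(torch.zeros(num_features))  # stand-in for internal regulation (IRS)

    def forward(self, x):
        # Gate in (0, 1): values near 0 actively forget a feature, values near 1 retain it.
        gate = torch.sigmoid(self.inhibitory(x) + self.self_decay)
        return x * gate

layer = ForgettingGate(16)
out = layer(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 16])
```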
Copy-move forgery is a manipulation in which specific patches of an image are copied and pasted within it, with potentially illegal or unethical uses. Recent advances in forensic methods for copy-move forgery have shown increasing success in detection accuracy and robustness. However, for images with high self-similarity or strong signal corruption, existing algorithms often exhibit inefficient processing and unreliable results. This is mainly due to the inherent semantic gap between low-level visual representations and high-level semantic concepts. In this paper, we present an initial study that attempts to mitigate the semantic gap problem in copy-move forgery detection by spatially pooling local moment invariants into a mid-level image representation. Our detection method extends traditional works in two aspects: 1) we introduce the bag-of-visual-words model into this field for the first time, which may offer a new perspective for forensic research; and 2) we propose a word-to-phrase feature description and matching pipeline that covers the spatial structure and visual saliency information of digital images. Extensive experimental results show that our framework outperforms state-of-the-art algorithms in overcoming the problems caused by the semantic gap.
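A rough sketch of the low-level ingredients mentioned above: describing local patches with moment invariants (here, Hu moments via OpenCV) and quantizing them into visual words with k-means. The paper's specific moment invariants, spatial pooling, and word-to-phrase matching are not reproduced here; this only illustrates the general pipeline.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def patch_moment_descriptors(gray, patch=16, stride=8):
    """Describe overlapping patches of a grayscale image with Hu moment invariants."""
    descs, coords = [], []
    h, w = gray.shape
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            m = cv2.moments(gray[y:y + patch, x:x + patch].astype(np.float32))
            descs.append(cv2.HuMoments(m).ravel())
            coords.append((y, x))
    return np.array(descs), np.array(coords)

img = np.random.randint(0, 255, (128, 128), dtype=np.uint8)  # stand-in for a grayscale image
descs, coords = patch_moment_descriptors(img)
words = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(descs)  # visual vocabulary
# Patches assigned to the same visual word are candidate copy-move correspondences.
print(descs.shape, words.shape)
```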
We present a deep-learning-based computational framework for fast and accurate CT testing of COVID-19 (DL-FACT). The CT-based DL framework improves the testing speed and accuracy for COVID-19 (and its variants) via DL-based CT image enhancement and classification. The image enhancement network is adapted from DDnet (short for DenseNet and Deconvolution-based network). To demonstrate its speed and accuracy, we evaluated DL-FACT on several sources of COVID-19 CT images. Our results show that DL-FACT can significantly shorten the turnaround time from days to minutes and improve COVID-19 testing accuracy up to 91%. DL-FACT can be used as a software tool by medical professionals for diagnosing and monitoring COVID-19.
A recent study has shown a phenomenon called neural collapse, in which the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to the discrimination of minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to this appealing structure in imbalanced semantic segmentation. Experimental results show that our method brings significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
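To make the target structure concrete: in a simplex equiangular tight frame over K classes, the pairwise cosine similarity between class centers equals -1/(K-1). Below is a hedged sketch of a regularizer that pulls feature centers toward that structure; it illustrates the idea rather than the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def center_regularizer(centers):
    """Encourage class feature centers toward a simplex ETF by pushing pairwise cosine
    similarities toward -1/(K-1). Illustrative sketch, not the paper's exact regularizer."""
    k = centers.shape[0]
    normed = F.normalize(centers, dim=1)
    cos = normed @ normed.t()                      # (K, K) pairwise cosine similarities
    target = torch.full_like(cos, -1.0 / (k - 1))  # ETF off-diagonal value
    off_diag = ~torch.eye(k, dtype=torch.bool)
    return ((cos - target)[off_diag] ** 2).mean()

centers = torch.randn(20, 64, requires_grad=True)  # e.g., 20 classes, 64-dim feature centers
loss = center_regularizer(centers)
loss.backward()
print(float(loss))
```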
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only image-level labels. Most existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet they ignore the co-occurrence confounder of object and context (e.g., fish and water), which makes it hard for the model to distinguish object boundaries. Besides, the use of CAM also brings a dilemma: classification and localization always suffer from a performance gap and cannot reach their highest accuracy simultaneously. In this paper, we propose a causal knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and in addressing the dilemma between classification and localization performance.
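For background, the standard CAM computation referenced above weights the last convolutional feature maps by the classifier weights of the target class; a minimal sketch follows. The causal-intervention and multi-teacher distillation components of KD-CI-CAM are not sketched here.

```python
import torch

def class_activation_map(features, fc_weights, class_idx):
    """Standard CAM: weight last-conv feature maps (C, H, W) by the classifier weights
    (num_classes, C) of the target class, then rectify and normalize for visualization."""
    cam = torch.einsum("c,chw->hw", fc_weights[class_idx], features)
    cam = torch.relu(cam)
    return cam / (cam.max() + 1e-8)

# Toy usage with random tensors standing in for network outputs.
cam = class_activation_map(torch.randn(512, 7, 7), torch.randn(1000, 512), class_idx=3)
print(cam.shape)  # torch.Size([7, 7])
```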
Witnessing the impressive achievements of pre-training techniques on large-scale data in computer vision and natural language processing, we wonder whether this idea could be adopted in a grab-and-go spirit to mitigate the sample inefficiency problem of visuomotor driving. Given the highly dynamic and variable nature of the input, the visuomotor driving task inherently lacks view and translation invariance, and the visual input contains massive information irrelevant to decision making, rendering predominant pre-training approaches from general vision less suitable for the autonomous driving task. To this end, we propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for policy pre-training in visuomotor driving. We aim to learn policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos. The proposed PPGeo is performed in two stages to support effective self-supervised training. In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, taking two consecutive frames as input. In the second stage, the visual encoder learns the driving policy representation by predicting the future ego-motion and optimizing the photometric error based on the current visual observation only. As such, the pre-trained visual encoder is equipped with rich driving-policy-related representations and is thereby competent for multiple visuomotor driving tasks. Extensive experiments covering a wide span of challenging scenarios demonstrate the superiority of our proposed approach, with improvements ranging from 2% to even over 100% with very limited data. Code and models will be available at https://github.com/OpenDriveLab/PPGeo.
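The photometric supervision mentioned above can be sketched as follows: predicted depth and relative pose are used to warp one frame into the other view, and the reconstruction error drives learning. This is a simplified sketch assuming known intrinsics and ignoring occlusion handling and auto-masking, not the PPGeo implementation.

```python
import torch
import torch.nn.functional as F

def inverse_warp(src, depth, pose, K):
    """Warp the source frame into the target view using the target-view depth map (B, 1, H, W),
    the predicted relative pose target->source (B, 4, 4), and camera intrinsics K (3, 3)."""
    b, _, h, w = src.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=src.dtype), torch.arange(w, dtype=src.dtype), indexing="ij"
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).reshape(3, -1)  # homogeneous pixel grid
    cam = (K.inverse() @ pix) * depth.reshape(b, 1, -1)                     # back-project to 3D points
    cam = torch.cat([cam, torch.ones(b, 1, h * w, dtype=src.dtype)], dim=1)
    src_pix = K @ (pose @ cam)[:, :3]                                       # move into source view, project
    src_pix = src_pix[:, :2] / src_pix[:, 2:].clamp(min=1e-6)
    gx = 2 * src_pix[:, 0] / (w - 1) - 1                                    # normalize for grid_sample
    gy = 2 * src_pix[:, 1] / (h - 1) - 1
    grid = torch.stack([gx, gy], dim=-1).reshape(b, h, w, 2)
    return F.grid_sample(src, grid, align_corners=True)

def photometric_loss(target, src, depth, pose, K):
    """L1 photometric error between the target frame and the source frame warped into its view."""
    return (target - inverse_warp(src, depth, pose, K)).abs().mean()

# Toy usage with random tensors standing in for two consecutive frames and network predictions.
b, h, w = 2, 32, 48
src, target = torch.rand(b, 3, h, w), torch.rand(b, 3, h, w)
depth = torch.rand(b, 1, h, w) + 0.5
pose = torch.eye(4).repeat(b, 1, 1)
K = torch.tensor([[30.0, 0.0, w / 2], [0.0, 30.0, h / 2], [0.0, 0.0, 1.0]])
print(float(photometric_loss(target, src, depth, pose, K)))
```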
In this work, we focus on instance-level open vocabulary segmentation, intending to expand a segmenter for instance-wise novel categories without mask annotations. We investigate a simple yet effective framework with the help of image captions, focusing on exploiting thousands of object nouns in captions to discover instances of novel classes. Rather than adopting pretrained caption models or using massive caption datasets with complex pipelines, we propose an end-to-end solution from two aspects: caption grounding and caption generation. In particular, we devise a joint Caption Grounding and Generation (CGG) framework based on a Mask Transformer baseline. The framework has a novel grounding loss that performs explicit and implicit multi-modal feature alignments. We further design a lightweight caption generation head to allow for additional caption supervision. We find that grounding and generation complement each other, significantly enhancing the segmentation performance for novel categories. We conduct extensive experiments on the COCO dataset with two settings: Open Vocabulary Instance Segmentation (OVIS) and Open Set Panoptic Segmentation (OSPS). The results demonstrate the superiority of our CGG framework over previous OVIS methods, achieving a large improvement of 6.8% mAP on novel classes without extra caption data. Our method also achieves over 15% PQ improvements for novel classes on the OSPS benchmark under various settings.
Nearest-Neighbor (NN) classification has proven to be a simple and effective approach for few-shot learning. Query data can be classified efficiently by finding the nearest support class based on features extracted by pretrained deep models. However, NN-based methods are sensitive to the data distribution and may produce false predictions if the samples in the support set happen to lie around the distribution boundary of different classes. To solve this issue, we present P3DC-Shot, an improved nearest-neighbor based few-shot classification method empowered by prior-driven data calibration. Inspired by the distribution calibration technique, which utilizes the distribution or statistics of the base classes to calibrate the data for few-shot tasks, we propose a novel discrete data calibration operation that is more suitable for NN-based few-shot classification. Specifically, we treat the prototypes representing each base class as priors and calibrate each support sample based on its similarity to different base prototypes. Then, we perform NN classification using these discretely calibrated support data. Results from extensive experiments on various datasets show that our efficient non-learning-based method can outperform, or at least be comparable to, SOTA methods that require additional learning steps.
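A hedged sketch of the overall idea: treat base-class prototypes as priors, pull each support feature toward its most similar prototypes, and then run nearest-neighbor classification on the calibrated support set. The mixing rule and the hyper-parameters alpha and k below are illustrative, not the paper's exact discrete calibration.

```python
import numpy as np

def calibrate_support(support, base_prototypes, alpha=0.7, k=2):
    """Pull each support feature toward the k most similar base-class prototypes (priors)."""
    s = support / np.linalg.norm(support, axis=1, keepdims=True)
    p = base_prototypes / np.linalg.norm(base_prototypes, axis=1, keepdims=True)
    sims = s @ p.T                                   # cosine similarity to every base prototype
    topk = np.argsort(-sims, axis=1)[:, :k]          # discrete choice of the k closest priors
    prior = base_prototypes[topk].mean(axis=1)       # average of the selected prototypes
    return alpha * support + (1 - alpha) * prior

def nearest_neighbor_classify(query, calibrated_support, support_labels):
    """Nearest-neighbor classification over class means of the calibrated support set."""
    classes = np.unique(support_labels)
    means = np.stack([calibrated_support[support_labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(query[:, None] - means[None], axis=2)
    return classes[dists.argmin(axis=1)]

rng = np.random.default_rng(0)
base_prototypes = rng.normal(size=(64, 32))          # 64 base classes, 32-dim features
support, labels = rng.normal(size=(5, 32)), np.arange(5)  # a 5-way 1-shot support set
query = rng.normal(size=(10, 32))
print(nearest_neighbor_classify(query, calibrate_support(support, base_prototypes), labels))
```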